Impact of Dietary Shifts on Gut Microbiome Dynamics

Multivariate Insights Using R

R for Bio Data Analysis

Group 16: Eric Torres, Lucia de Lamadrid, Konstantina Gkopi, Elena Iriondo and Jorge Santiago

2024-12-03

Introduction

Our aim:

To study the relationship between the composition of the gut microbiota and factors such as diet and colonisation history.

Materials and Methods

General Workflow

MICROBIOME METADATA:

# A tibble: 6 × 6,701
   Diet Source Donor CollectionMet   Sex     OTU0     OTU1     OTU2     OTU3
  <dbl>  <dbl> <dbl>         <dbl> <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1     0      0     0             0     0 1.56e-11 4.72e-11 1.23e-11 4.52e-11
2     0      1     0             0     0 2.36e-11 9.53e-11 3.33e-11 2.67e-11
3     0      2     0             1     0 6.77e-11 3.68e-11 8.02e-11 5.49e-11
4     0      2     0             0     0 5.52e-11 9.89e-11 4.58e-11 3.54e-11
5     0      3     0             0     0 5.24e-11 6.34e-11 2.35e-11 7.47e-11
6     0      4     0             1     0 7.67e-11 7.22e-11 5.41e-11 1.20e-11
# ℹ 6,692 more variables: OTU4 <dbl>, OTU5 <dbl>, OTU6 <dbl>, OTU7 <dbl>,
#   OTU8 <dbl>, OTU9 <dbl>, OTU10 <dbl>, OTU11 <dbl>, OTU12 <dbl>, OTU13 <dbl>,
#   OTU14 <dbl>, OTU15 <dbl>, OTU16 <dbl>, OTU17 <dbl>, OTU18 <dbl>,
#   OTU19 <dbl>, OTU20 <dbl>, OTU21 <dbl>, OTU22 <dbl>, OTU23 <dbl>,
#   OTU24 <dbl>, OTU25 <dbl>, OTU26 <dbl>, OTU27 <dbl>, OTU28 <dbl>,
#   OTU29 <dbl>, OTU30 <dbl>, OTU31 <dbl>, OTU32 <dbl>, OTU33 <dbl>,
#   OTU34 <dbl>, OTU35 <dbl>, OTU36 <dbl>, OTU37 <dbl>, OTU38 <dbl>, …

OTU TAXONOMY GLOSSARY:

  OTU.ID  Kingdom        Phylum         Class           Order
1   OTU0 Bacteria                                            
2   OTU1 Bacteria    Firmicutes    Clostridia   Clostridiales
3   OTU2 Bacteria    Firmicutes       Bacilli Lactobacillales
4   OTU3 Bacteria Bacteroidetes Bacteroidetes   Bacteroidales
5   OTU4 Bacteria Bacteroidetes                              
6   OTU5 Bacteria    Firmicutes    Clostridia   Clostridiales
              Family           Genus X X.1
1                                         
2    Ruminococcaceae                      
3    Enterococcaceae    Enterococcus      
4 Porphyromonadaceae Parabacteroides      
5                                         
6                                         

Data Tidying and Filtering

  • Added a SampleID column to uniquely identify each sample.

  • Transformed the dataset from wide to long format for easier analysis.

  • Kept only the OTUs contributing up to 95% of cumulative abundance.

  • Replaced the numeric codes with descriptive labels.

# Creation and relocation of SampleID
metadata_df <- metadata_df |>
  mutate(SampleID = row_number()) |>  # Create SampleID from the row number
  relocate(SampleID, 
           .before = everything())  # Move SampleID to the first position

metadata_df_long <- metadata_df |> 
  pivot_longer(
    cols = starts_with("OTU"), 
    names_to = "OTU", 
    values_to = "rel_abundance"
  )

head(metadata_df_long)

# Calculate cumulative contribution
cumulative_otus <- metadata_df_long |>
  group_by(OTU) |>
  summarize(mean_abundance = mean(rel_abundance)) |>
  arrange(desc(mean_abundance)) |>
  mutate(cumulative_abundance = cumsum(mean_abundance) / sum(mean_abundance))

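# Select the OTUs that make up the first 95% of cumulative abundance
otus_to_keep <- cumulative_otus |>
  filter(cumulative_abundance <= 0.95) |>
  pull(OTU)

# Keep only those OTUs in the long-format data
filtered_metadata <- metadata_df_long |>
  filter(OTU %in% otus_to_keep)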

# Number of OTUs before filtering
n_total_otus <- metadata_df_long |> 
  pull(OTU) |> 
  n_distinct()

# Number of OTUs after filtering
n_filtered_otus <- filtered_metadata |> 
  pull(OTU) |> 
  n_distinct()
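
The cumulative-abundance filter can be checked on a small example. The sketch below is illustrative only: toy_otus and its values are hypothetical and are not taken from the study data.

```r
library(dplyr)

# Hypothetical mean abundances for five OTUs (not from the study data)
toy_otus <- tibble(
  OTU = c("OTU_a", "OTU_b", "OTU_c", "OTU_d", "OTU_e"),
  mean_abundance = c(0.50, 0.30, 0.12, 0.05, 0.03)
)

# Rank OTUs by abundance and compute the running share of the total
toy_cumulative <- toy_otus |>
  arrange(desc(mean_abundance)) |>
  mutate(cumulative_abundance = cumsum(mean_abundance) / sum(mean_abundance))

# Keep OTUs whose running share stays at or below 95%
toy_keep <- toy_cumulative |>
  filter(cumulative_abundance <= 0.95) |>
  pull(OTU)

toy_keep
```

In this toy case the running shares are 0.50, 0.80, 0.92, 0.97 and 1.00, so three OTUs are kept. The OTU that crosses the threshold is itself dropped, which means the retained set covers slightly less than 95% of the total abundance.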

filtered_metadata_stricter_label <- filtered_metadata_stricter |> 
  mutate(Diet = case_when(Diet == 0 ~ "LFPP",
                          Diet == 1 ~ "Western",
                          Diet == 2 ~ "CARBR",
                          Diet == 3 ~ "FATR",
                          Diet == 4 ~ "Suckling",
                          Diet == 5 ~ "Human")) |> 
  mutate(Source = case_when(Source == 0 ~ "Cecum1",
                          Source == 1 ~ "Cecum2", 
                          Source == 2 ~ "Colon1", 
                          Source == 3 ~ "Colon2", 
                          Source == 4 ~ "Feces",
                          Source == 5 ~ "SI1",
                          Source == 6 ~ "SI13", 
                          Source == 7 ~ "SI15", 
                          Source == 8 ~ "SI2", 
                          Source == 9 ~ "SI5",
                          Source == 10 ~ "SI9", 
                          Source == 11 ~ "Stomach", 
                          Source == 12 ~ "Cecum")) |> 
  mutate(Donor = case_when(Donor == 0 ~ "HMouseLFPP",
                          Donor == 1 ~ "CONVR", 
                          Donor == 2 ~ "Human", 
                          Donor == 3 ~ "Fresh", 
                          Donor == 4 ~ "Frozen",
                          Donor == 5 ~ "HMouseWestern", 
                          Donor == 6 ~ "CONVD")) |> 
  mutate(CollectionMet = case_when(CollectionMet == 0 ~ "Contents",
                                   CollectionMet == 1 ~ "Scraping")) |> 
  mutate(Sex = case_when(Sex == 0 ~ "Male",
                         Sex == 1 ~ "Female")) 
head(filtered_metadata_stricter_label)

Now, our data is tidy!

# A tibble: 6 × 8
  SampleID Diet  Source Donor      CollectionMet Sex   OTU   rel_abundance
     <dbl> <chr> <chr>  <chr>      <chr>         <chr> <chr>         <dbl>
1        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU6       3.31e-11
2        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU7       5.08e-11
3        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU9       2.57e- 3
4        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU41      7.95e-11
5        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU58      2.53e-11
6        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU77      1.28e- 3

and ready to be augmented…

We use the OTU taxonomy file to add the phylum and class of each OTU as new columns, using left_join.

clean_df_taxonomy <- clean_df |>  
  left_join(otu_df_modified, 
            join_by(OTU == OTU.ID)) |> 
  relocate(Phylum, Class, .after = OTU) 

head(clean_df_taxonomy)
# A tibble: 6 × 10
  SampleID Diet  Source Donor      CollectionMet Sex   OTU   Phylum     Class   
     <dbl> <chr> <chr>  <chr>      <chr>         <chr> <chr> <chr>      <chr>   
1        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU6  Firmicutes Bacilli 
2        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU7  Firmicutes Clostri…
3        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU9  Firmicutes Clostri…
4        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU41 Firmicutes Bacilli 
5        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU58 Firmicutes Clostri…
6        1 LFPP  Cecum1 HMouseLFPP Contents      Male  OTU77 Firmicutes Clostri…
# ℹ 1 more variable: rel_abundance <dbl>
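
A small sketch of how left_join behaves in this kind of augmentation, in particular for an OTU with no taxonomy entry. The tables toy_abundance and toy_taxonomy are hypothetical and only mimic the column names used above.

```r
library(dplyr)

# Hypothetical long-format abundances; "OTU999" has no taxonomy entry
toy_abundance <- tibble(
  OTU = c("OTU1", "OTU2", "OTU999"),
  rel_abundance = c(0.2, 0.1, 0.05)
)

# Hypothetical taxonomy glossary keyed by OTU.ID
toy_taxonomy <- tibble(
  OTU.ID = c("OTU1", "OTU2"),
  Phylum = c("Firmicutes", "Firmicutes"),
  Class = c("Clostridia", "Bacilli")
)

# left_join keeps every abundance row; unmatched OTUs get NA taxonomy
toy_joined <- toy_abundance |>
  left_join(toy_taxonomy, join_by(OTU == OTU.ID)) |>
  relocate(Phylum, Class, .after = OTU)

toy_joined
```

Because left_join never drops rows from the left table, the number of abundance rows is preserved, and OTUs missing from the glossary show up with NA in Phylum and Class rather than disappearing.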

Results and Discussion

Microbiota composition at the phylum level across different:

  • Sources and Diet Types

  • Diet and Donor Combination

Linear model and identification of OTUs differentially associated with diet

# Fit one linear model per OTU on its nested data
diet_phylum_nested <- diet_phylum_nested |> 
  group_by(OTU) |>  
  mutate(model_object = map(.x = data, 
                            .f = ~lm(formula = rel_abundance ~ Diet, 
                                     data = .x)))

After fitting the linear models, we corrected the p-values for multiple testing and evaluated statistical significance:

# A tibble: 6 × 7
  OTU    p.value  estimate  conf.low  conf.high  q.value is_significant
  <chr>    <dbl>     <dbl>     <dbl>      <dbl>    <dbl> <chr>         
1 OTU6  2.08e- 2  0.00239   0.000365  0.00441   1   e+ 0 no            
2 OTU7  1.17e- 2 -0.000204 -0.000362 -0.0000455 1   e+ 0 no            
3 OTU9  4.16e- 9 -0.000245 -0.000325 -0.000164  1.10e- 6 yes           
4 OTU41 4.61e-17  0.000460  0.000355  0.000565  1.44e-14 yes           
5 OTU58 2.04e- 1 -0.000174 -0.000442  0.0000948 1   e+ 0 no            
6 OTU77 1.12e- 5  0.000415  0.000231  0.000600  2.46e- 3 yes           
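
One way a table of this shape can be produced is sketched below on simulated data. Several details are assumptions: the slides do not name the correction method or the significance cut-off, so method = "holm" and the 0.05 threshold are placeholders, and toy_long is hypothetical.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

set.seed(1)

# Hypothetical data: OTU_a shifts with diet, OTU_b does not
toy_long <- tibble(
  OTU = rep(c("OTU_a", "OTU_b"), each = 20),
  Diet = rep(rep(c(0, 1), each = 10), times = 2),
  rel_abundance = c(rnorm(10, 0.010, 0.001), rnorm(10, 0.020, 0.001),
                    rnorm(20, 0.010, 0.001))
)

toy_estimates <- toy_long |>
  group_by(OTU) |>
  nest() |>
  # One linear model per OTU, then a tidy coefficient table for each
  mutate(model_object = map(data, ~ lm(rel_abundance ~ Diet, data = .x)),
         model_tidy = map(model_object, ~ tidy(.x, conf.int = TRUE))) |>
  unnest(model_tidy) |>
  ungroup() |>
  # Keep the slope for Diet, dropping the intercept rows
  filter(term == "Diet") |>
  select(OTU, p.value, estimate, conf.low, conf.high) |>
  # Assumed correction method and threshold (not stated in the slides)
  mutate(q.value = p.adjust(p.value, method = "holm"),
         is_significant = if_else(q.value < 0.05, "yes", "no"))

toy_estimates
```

The correction has to be applied after ungrouping, so that p.adjust sees the p-values of all OTUs at once rather than one at a time.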

Principal Component Analysis on Phylum-Level Aggregated Microbiome Data

  • The clear separation between green and pink points indicates that the microbiome composition is strongly influenced by diet.

  • Samples from the Western diet have distinct characteristics compared to those from the LFPP diet, as reflected in their separation along the principal components.

  • The PCA variance table shows that PC1 and PC2 together explain ~55% of the total variance. Including PC3, PC4 and PC5 increases the cumulative explained variance to 90%, capturing most of the dataset’s variability.

  • Western diet correlates with a negative PC1 coordinate, which is in line with Western diet observations being found on the left of the previous scatter plot.

  • Unclassified phyla show the opposite behaviour to the diet variable.

  • Bacteroidetes and Firmicutes also show opposite behaviours, consistent with their biological significance.
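
As a rough illustration of how a variance table and loadings like those discussed above can be obtained, the sketch below runs prcomp() on simulated data. The matrix toy_phyla, its values and the seed are hypothetical and do not reproduce the study's figures.

```r
set.seed(42)

# Hypothetical phylum-level abundances for 30 samples, plus a numeric Diet code
toy_phyla <- cbind(
  Diet = rep(c(0, 1), each = 15),
  Firmicutes = rnorm(30, 0.5, 0.1),
  Bacteroidetes = rnorm(30, 0.3, 0.1),
  Proteobacteria = rnorm(30, 0.1, 0.05),
  Unclassified = rnorm(30, 0.1, 0.05)
)

# Centre and scale each variable so that no single one dominates
pca_fit <- prcomp(toy_phyla, center = TRUE, scale. = TRUE)

# Share of variance per component and the running total
explained <- pca_fit$sdev^2 / sum(pca_fit$sdev^2)
cumulative_explained <- cumsum(explained)

# Loadings on the first two components: variables with opposite signs
# on a component pull samples in opposite directions along it
pca_fit$rotation[, 1:2]
```

Reading cumulative_explained gives the kind of statement made above about how many components are needed to reach a given share of the variance.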

Analysis of Microbiome Clusters by Donor Groups Using Hierarchical Clustering

# Compute Euclidean distance matrix
dist_matrix <- otu_data_scaled |>
  dist()

# Perform hierarchical clustering
hclust_result <- hclust(dist_matrix, method = "ward.D2")

# Cut dendrogram into 3 clusters
cluster_labels <- cutree(hclust_result, k = 3) |>
  as_tibble() |>
  rename(Cluster = value)

# Perform chi-squared test
chi2_result <- chisq.test(donor_cluster_table)
chi2_result

    Pearson's Chi-squared test

data:  donor_cluster_table
X-squared = 659.91, df = 8, p-value < 2.2e-16

  • Cluster 1 is dominated by HMouseLFPP (55.5%) with notable contributions from Frozen (17.8%) and Fresh (18.7%), reflecting plant-rich diets and preserved samples.

  • Cluster 2 includes mostly Fresh (55.1%) and HMouseLFPP (26.5%), indicating a mix of human-derived and dietary influences.

  • Cluster 3 is almost entirely CONVR (95%), representing natural microbiota from control mice.

  • The chi-squared test confirms significant associations between donor origins and clusters, highlighting the influence of donors on microbiota composition.
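
The step from cluster labels to the donor-by-cluster table and its percentages is not shown in the code above. A minimal sketch of one way to do it, using simulated data, follows. The names toy_scaled and toy_donor are hypothetical, and the three simulated donor groups are deliberately well separated.

```r
set.seed(7)

# Hypothetical scaled abundance matrix: three well-separated groups of 20 samples
toy_scaled <- scale(rbind(
  matrix(rnorm(80, mean = 0), ncol = 4),
  matrix(rnorm(80, mean = 5), ncol = 4),
  matrix(rnorm(80, mean = 10), ncol = 4)
))
toy_donor <- rep(c("DonorA", "DonorB", "DonorC"), each = 20)

# Ward clustering on Euclidean distances, cut into 3 clusters
toy_clusters <- cutree(hclust(dist(toy_scaled), method = "ward.D2"), k = 3)

# Contingency table of donor against cluster
toy_table <- table(Donor = toy_donor, Cluster = toy_clusters)

# Percentage of each donor within each cluster (columns sum to 100)
toy_col_pct <- round(100 * prop.table(toy_table, margin = 2), 1)

# Test for association between donor and cluster membership
toy_chi2 <- chisq.test(toy_table)
toy_chi2
```

The degrees of freedom of the test are (number of donors - 1) × (number of clusters - 1), which is 4 for this 3 × 3 toy table.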

Biodiversity and diet

Shannon diversity index

The index combines two components:

  • The number of species living in a habitat (richness).
  • Their relative abundance (evenness).

\[ H' = -\sum_{i=1}^{R} p_i \ln p_i \]

where \(p_i\) is the relative abundance of OTU \(i\) and \(R\) is the total number of OTUs.
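
The formula translates directly into a few lines of R. This is a minimal sketch: shannon_index is a hypothetical helper name and the abundance vectors are made up for illustration.

```r
# Shannon diversity H' for a vector of (relative) abundances
shannon_index <- function(abundance) {
  p <- abundance / sum(abundance)  # renormalise so the p_i sum to 1
  p <- p[p > 0]                    # treat 0 * ln(0) as 0
  -sum(p * log(p))
}

even_community <- shannon_index(c(0.25, 0.25, 0.25, 0.25))
skewed_community <- shannon_index(c(0.97, 0.01, 0.01, 0.01))

even_community    # equals log(4), the maximum for R = 4
skewed_community  # lower, because one OTU dominates
```

For a fixed number of OTUs the index is largest when all abundances are equal, and it falls to 0 when a single OTU makes up the whole community.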

Biodiversity in the microbiota of first-generation humanized mice was found to differ significantly across different diets.

Conclusion

  • The “Obesity-inducing” diet influences the Firmicutes-Bacteroidetes ratio
  • PCA shows how diet shapes microbial composition, as well as the relationship between different phyla.

  • Clustering sheds light on how the microbiota donor structures the data

  • The Western diet favours a more biodiverse gut ecosystem